Written
Corpus
Language Type:
Multilingual
Languages:
American English
Availability:
From Data Center(s)
License:
ICAME
Size:
approx. 1,000,000 <Not Specified>
Production Status:
Existing-used
Use:
Diachronic studies
- Paper title: Diachronic Changes in Text Complexity in 20th Century English Language: An NLP Approach
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Sanja Štajner | University of Wolverhampton | None |
| Author 2 | Ruslan Mitkov | University of Wolverhampton | None |
| Main Contact | Sanja Štajner | University of Wolverhampton | GB |
Documentation:
http://khnt.aksis.uib.no/icame/manuals/brown/INDEX.HTM
Written
Language Resources/Technologies Infrastructure
Language Type:
Multilingual
Languages:
American English
Availability:
From Owner
License:
<Not Specified>
Size:
14 GByte
Production Status:
Existing-used
Use:
Web Services
- Paper title: Facing the Identification Problem in Language-Related Scientific Data Analysis.
- Paper track: Infrastructural Issues/Large Projects
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | Affiliation 2 | Country 2 |
|---|---|---|---|---|---|
| Author 1 | Joseph Mariani | LIMSI-CNRS | FR | LIMSI-CNRS & IMMI | FR |
| Author 2 | Christopher Cieri | LDC | US | | |
| Author 3 | Gil Francopoulo | Tagmatica + IMMI-CNRS | FR | | |
| Author 4 | Patrick Paroubek | LIMSI-CNRS | FR | | |
| Author 5 | Marine Delaborde | LIMSI-CNRS | FR | | |
| Main Contact | Joseph Mariani | LIMSI-CNRS | None | | |
Documentation:
<Not Specified>
Written
Corpus
Language Type:
Multilingual
Languages:
American English, English
Availability:
Freely Available
License:
CreativeCommons
Size:
3470 annotated questions with answers Other
Production Status:
Newly created-finished
Use:
Question Answering
- Paper title: A Study on Expert Sourcing Enterprise Question Collection and Classification
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Yuan Luo | MIT, IBM | US |
| Author 2 | Thomas Boucher | UMass CS, IBM | US |
| Author 3 | Tolga Oral | IBM | US |
| Author 4 | David Osofsky | IBM | US |
| Author 5 | Sara Weber | IBM | US |
| Main Contact | Yuan Luo | MIT, IBM | None |
Documentation:
Annotation Guideline, English, publicly available
Language Type:
Multilingual
Languages:
American English
Availability:
Not Available
License:
<Not Specified>
Size:
220 Mbytes <Not Specified>
Production Status:
Newly created-in progress
Use:
Speech Synthesis
- Paper title: Practical Evaluation of Human and Synthesized Speech for Virtual Human Dialogue Systems
- Paper track: Evaluation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Kallirroi Georgila | <Not Specified> | None |
| Author 2 | Alan Black | <Not Specified> | None |
| Author 3 | Kenji Sagae | <Not Specified> | None |
| Author 4 | David Traum | <Not Specified> | None |
| Main Contact | Kallirroi Georgila | University of Southern California | US |
Documentation:
No Documentation
Language Type:
Multilingual
Languages:
American English, German
Availability:
Can be crawled; the corpus is not published due to copyright issues
License:
<Not Specified>
Size:
<Not Specified> <Not Specified>
Production Status:
Newly created-in progress
Use:
Information Extraction, Information Retrieval
- Paper title: Textual Characteristics for Language Engineering
- Paper track: General issues
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | Affiliation 2 | Country 2 |
|---|---|---|---|---|---|
| Author 1 | Mathias Bank | <Not Specified> | None | | |
| Author 2 | Robert Remus | <Not Specified> | None | Universität Leipzig | None |
| Author 3 | Martin Schierle | <Not Specified> | None | | |
| Main Contact | Mathias Bank | Pattern Science AG | DE | | |
Documentation:
<Not Specified>
Language Type:
Trilingual
Languages:
American English, Mandarin Chinese, Urdu
Availability:
From Owner
License:
<Not Specified>
Size:
2.3 <Not Specified>
Production Status:
Newly created-in progress
Use:
Language Modelling
- Paper title: Extending the MPC corpus to Chinese and Urdu - A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Ting Liu | <Not Specified> | None |
| Author 2 | Samira Shaikh | <Not Specified> | None |
| Author 3 | Tomek Strzalkowski | <Not Specified> | None |
| Author 4 | Aaron Broadwell | <Not Specified> | None |
| Author 5 | Jennifer Stromer-Galley | <Not Specified> | None |
| Author 6 | Sarah Taylor | <Not Specified> | None |
| Author 7 | Umit Boz | <Not Specified> | None |
| Author 8 | Xiaoai Ren | <Not Specified> | None |
| Author 9 | Jingsi Wu | <Not Specified> | None |
| Main Contact | Ting Liu | ILS, University at Albany | US |
Documentation:
<Not Specified>
Speech/Written
Corpus
Language Type:
Multilingual
Languages:
American English
Availability:
Freely Available
License:
OpenSource
Size:
100000 <Not Specified>
Production Status:
Newly created-in progress
Use:
Word Sense Disambiguation
- Paper title: The MASC Word Sense Corpus
- Paper track: General issues
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country | Affiliation 2 | Country 2 |
|---|---|---|---|---|---|
| Author 1 | Rebecca Passonneau | Columbia University | None | | |
| Author 2 | Collin Baker | ICSI | None | International Computer Science Institute | None |
| Author 3 | Christiane Fellbaum | Princeton University | None | | |
| Author 4 | Nancy Ide | Vassar College | None | | |
| Main Contact | Rebecca Passonneau | Columbia University | US | | |
Documentation:
Documentation in English with downloads and at website
Written
Corpus
Language Type:
Multilingual
Languages:
American English
Availability:
From Owner
License:
<Not Specified>
Size:
1467 questions Other
Production Status:
Newly created-in progress
Use:
Question Answering
- Paper title: Annotating Question Decomposition on Complex Medical Questions
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | Affiliation 2 | Country 2 |
|---|---|---|---|---|---|
| Author 1 | Kirk Roberts | National Library of Medicine | US | University of Texas Health Science Center at Houston | US |
| Author 2 | Kate Masterton | National Library of Medicine | US | National Library of Medicine | None |
| Author 3 | Marcelo Fiszman | National Library of Medicine | US | | |
| Author 4 | Halil Kilicoglu | National Library of Medicine | US | | |
| Author 5 | Dina Demner-Fushman | NLM | US | | |
| Main Contact | Kirk Roberts | National Library of Medicine | None | University of Texas Health Science Center at Houston | None |
Documentation:
Publicly available at resource URL.
Written
Tagger/Parser
Language Type:
Multilingual
Languages:
American English
Availability:
Freely Available
License:
Illinois Open Source License
Size:
96 <Not Specified>
Production Status:
Existing-used
Use:
NLP Tool provided as service
- Paper title: An NLP Curator (or: How I Learned to Stop Worrying and Love NLP Pipelines)
- Paper track: General issues
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | James Clarke | University of Illinois (Urbana-Champaign) | None |
| Author 2 | Vivek Srikumar | University of Illinois (Urbana-Champaign) | None |
| Author 3 | Mark Sammons | University of Illinois (Urbana-Champaign) | None |
| Author 4 | Dan Roth | University of Illinois (Urbana-Champaign) | None |
| Main Contact | Mark Sammons | University of Illinois | US |
Documentation:
http://cogcomp.cs.illinois.edu/page/software_view/3 (overview) + link to Javadoc, English
Written
Corpus
Language Type:
Multilingual
Languages:
American English
Availability:
From Owner
License:
none
Size:
390,000 forum posts Other
Production Status:
Newly created-in progress
Use:
Dialogue
- Paper title: A Corpus for Research on Deliberation and Debate
- Paper track: Evaluation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country | Affiliation 2 | Country 2 | Affiliation 3 | Country 3 |
|---|---|---|---|---|---|---|---|
| Author 1 | Marilyn Walker | <Not Specified> | None | UCSC UARC | None | | |
| Author 2 | Jean Fox Tree | <Not Specified> | None | University of California, Santa Cruz | US | University of California, Santa Cruz | N/A |
| Author 3 | Pranav Anand | <Not Specified> | None | UC Santa Cruz | US | | |
| Author 4 | Rob Abbott | <Not Specified> | None | | | | |
| Author 5 | Joseph King | <Not Specified> | None | | | | |
| Main Contact | Marilyn Walker | UCSC | US | | | | |
Documentation:
English documentation included




